Unconstrained optimisation on Riemannian manifolds
In this paper, we give explicit descriptions of versions of (Local-) Backtracking Gradient Descent and New Q-Newton's method in the Riemannian setting. Here are some easy-to-state consequences of results in this paper, where $X$ is a general Riemannian manifold of finite dimension and $f:X\rightarrow \mathbb{R}$ a $C^2$ function which is Morse (that is, all its critical points are non-degenerate).

{\bf Theorem.} For random choices of the hyperparameters in the Riemannian Local Backtracking Gradient Descent algorithm and for random choices of the initial point $x_0$, the sequence $\{x_n\}$ constructed by the algorithm either (i) converges to a local minimum of $f$ or (ii) eventually leaves every compact subset of $X$ (in other words, diverges to infinity on $X$). If $f$ has compact sublevels, then only the former alternative happens. The convergence rate is the same as in the classical paper by Armijo.

{\bf Theorem.} Assume that $f$ is $C^3$. For random choices of the hyperparameters in the Riemannian New Q-Newton's method, if the sequence constructed by the algorithm converges, then the limit is a critical point of $f$. We have a local Stable-Center manifold theorem, near saddle points of $f$, for the dynamical system associated to the algorithm. If the limit point is a non-degenerate minimum point, then the rate of convergence is quadratic. If moreover $X$ is an open subset of a Lie group and the initial point $x_0$ is chosen randomly, then we can globally avoid saddle points.

As an application, we propose a general method using Riemannian Backtracking GD to find the minimum of a function on a bounded ball in a Euclidean space, and carry out explicit calculations for computing the smallest eigenvalue of a symmetric square matrix.
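To make the algorithmic content concrete, here is a minimal NumPy sketch of the Euclidean special case ($X=\mathbb{R}^k$) of Backtracking Gradient Descent with Armijo's condition, applied to the smallest-eigenvalue example via the Rayleigh quotient. The function names and hyperparameter defaults are illustrative choices, not the paper's, and the genuinely Riemannian version (which needs a retraction or exponential map) is omitted.

```python
import numpy as np

def backtracking_gd(f, grad_f, x0, alpha=0.5, beta=0.5, delta0=1.0,
                    tol=1e-8, max_iter=10_000):
    """Euclidean Backtracking Gradient Descent (illustrative sketch).

    At each iterate, the step size delta is shrunk by the factor beta
    until Armijo's condition
        f(x - delta * g) - f(x) <= -alpha * delta * ||g||^2
    holds; then the gradient step is taken.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:      # approximate critical point
            break
        delta = delta0
        # Backtracking line search for Armijo's condition.
        while f(x - delta * g) - f(x) > -alpha * delta * np.dot(g, g):
            delta *= beta
        x = x - delta * g
    return x

# Toy usage: smallest eigenvalue of a symmetric matrix, via gradient
# descent on the Rayleigh quotient R(x) = x^T A x / x^T x.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
f = lambda x: x @ A @ x / (x @ x)
grad_f = lambda x: 2.0 * (A @ x - f(x) * x) / (x @ x)
x_min = backtracking_gd(f, grad_f, np.array([1.0, 0.3]))
print(f(x_min))  # close to the smallest eigenvalue (5 - sqrt(5)) / 2
```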
Coordinate-wise Armijo's condition
Let $z=(x,y)$ be coordinates on the product space $\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$. Let $f:\mathbb{R}^{m_1}\times \mathbb{R}^{m_2}\rightarrow \mathbb{R}$ be a $C^1$ function, and $\nabla f=(\partial _xf,\partial _yf)$ its gradient. Fix $0<\alpha <1$. For a point $(x,y) \in \mathbb{R}^{m_1}\times \mathbb{R}^{m_2}$, a number $\delta >0$ satisfies Armijo's condition at $(x,y)$ if the following inequality holds: \begin{eqnarray*} f(x-\delta \partial _xf,y-\delta \partial _yf)-f(x,y)\leq -\alpha \delta (||\partial _xf||^2+||\partial _yf||^2). \end{eqnarray*} When $f(x,y)=f_1(x)+f_2(y)$ is a coordinate-wise sum, we propose the following {\bf coordinate-wise} Armijo's condition. Fix again $0<\alpha <1$. A pair of positive numbers $\delta _1,\delta _2>0$ satisfies the coordinate-wise variant of Armijo's condition at $(x,y)$ if the following inequality holds: \begin{eqnarray*} [f_1(x-\delta _1\nabla f_1(x))+f_2(y-\delta _2\nabla f_2(y))]-[f_1(x)+f_2(y)]\leq -\alpha (\delta _1||\nabla f_1(x)||^2+\delta _2||\nabla f_2(y)||^2). \end{eqnarray*} We then extend the results of our recent previous work on Backtracking Gradient Descent and some variants to this setting, and show by an example the advantage of using the coordinate-wise Armijo's condition over the usual one.
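A minimal sketch, assuming $f(x,y)=f_1(x)+f_2(y)$, of one coordinate-wise backtracking step. Running backtracking separately on each summand yields a pair $(\delta_1,\delta_2)$ satisfying the coordinate-wise condition, since the two per-block Armijo inequalities add up to it. All names and defaults below are illustrative, not taken from the paper.

```python
import numpy as np

def coordinatewise_armijo_step(f1, grad_f1, f2, grad_f2, x, y,
                               alpha=0.5, beta=0.5, delta0=1.0):
    """One coordinate-wise backtracking step for f(x, y) = f1(x) + f2(y).

    Each block gets its own step size delta_i, found by backtracking on
    its own summand fi; summing the two per-block Armijo inequalities
    shows (delta_1, delta_2) satisfies the coordinate-wise condition.
    """
    gx, gy = grad_f1(x), grad_f2(y)

    def backtrack(fi, p, g):
        # Shrink delta until the Armijo inequality holds for this block.
        delta = delta0
        while fi(p - delta * g) - fi(p) > -alpha * delta * np.dot(g, g):
            delta *= beta
        return delta

    d1 = backtrack(f1, x, gx)
    d2 = backtrack(f2, y, gy)
    return x - d1 * gx, y - d2 * gy, (d1, d2)
```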
Convergence to minima for the continuous version of Backtracking Gradient Descent
The main result of this paper is: {\bf Theorem.} Let $f:\mathbb{R}^k\rightarrow \mathbb{R}$ be a $C^{1}$ function such that $\nabla f$ is locally Lipschitz continuous. Assume moreover that $f$ is $C^2$ near its generalised saddle points. Fix real numbers $\delta_0>0$ and $0<\alpha <1$. Then there is a smooth function $h:\mathbb{R}^k\rightarrow (0,\delta_0]$ such that the map $H:\mathbb{R}^k\rightarrow \mathbb{R}^k$ defined by $H(x)=x-h(x)\nabla f(x)$ has the following properties: (i) For all $x\in \mathbb{R}^k$, we have $f(H(x))-f(x)\leq -\alpha h(x)||\nabla f(x)||^2$. (ii) For every $x_0\in \mathbb{R}^k$, the sequence $x_{n+1}=H(x_n)$ satisfies either $\lim_{n\rightarrow\infty}||x_{n+1}-x_n||=0$ or $\lim_{n\rightarrow\infty}||x_n||=\infty$. Every cluster point of $\{x_n\}$ is a critical point of $f$. If moreover $f$ has at most countably many critical points, then $\{x_n\}$ either converges to a critical point of $f$ or $\lim_{n\rightarrow\infty}||x_n||=\infty$. (iii) There is a set $\mathcal{E}_1\subset \mathbb{R}^k$ of Lebesgue measure $0$ such that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_1$, the sequence $x_{n+1}=H(x_n)$, {\bf if it converges}, cannot converge to a {\bf generalised} saddle point. (iv) There is a set $\mathcal{E}_2\subset \mathbb{R}^k$ of Lebesgue measure $0$ such that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_2$, no cluster point of the sequence $x_{n+1}=H(x_n)$ is a saddle point, and more generally none can be an isolated generalised saddle point. Some other results are also proven.
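The theorem can be read dynamically: descent becomes iteration of a single fixed smooth map $H$. The sketch below iterates such a map with the plausible smooth step-size function $h(x)=\delta_0/(1+||\nabla f(x)||^2)$, which takes values in $(0,\delta_0]$; this particular $h$ is only a stand-in for experimentation, not the paper's construction, and it is not claimed to satisfy property (i).

```python
import numpy as np

def make_H(grad_f, delta0=1.0):
    """The dynamical system x_{n+1} = H(x_n), H(x) = x - h(x) grad f(x).

    The paper constructs a smooth h: R^k -> (0, delta0] for which H has
    the descent property (i); the h below is only a plausible smooth
    stand-in with the right range, chosen for illustration.
    """
    def H(x):
        g = grad_f(x)
        h = delta0 / (1.0 + np.dot(g, g))   # smooth, bounded step size
        return x - h * g
    return H

# f(x) = sum(x_i^4) / 4 is C^1 with locally Lipschitz gradient x^3.
H = make_H(lambda x: x ** 3)
x = np.array([2.0, -1.5])
for _ in range(5000):
    x = H(x)
print(x)  # drifts toward the unique critical point at the origin
```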
Reparameterizing Distributions on Lie Groups
Falorsi, Luca, de Haan, Pim, Davidson, Tim R., Forré, Patrick
Reparameterizable densities are an important way to learn probability distributions in a deep learning setting. For many distributions it is possible to create low-variance gradient estimators by utilizing a `reparameterization trick'. Due to the absence of a general reparameterization trick, much research has recently been devoted to extending the number of reparameterizable distributional families. Unfortunately, this research has primarily focused on distributions defined in Euclidean space, ruling out the usage of one of the most influential classes of spaces with non-trivial topologies: Lie groups. In this work we define a general framework to create reparameterizable densities on arbitrary Lie groups, and provide a detailed practitioner's guide to further ease of usage. We demonstrate how to create complex and multimodal distributions on the well-known oriented group of 3D rotations, $\operatorname{SO}(3)$, using normalizing flows. Our experiments on applying such distributions in a Bayesian setting for pose estimation on objects with discrete and continuous symmetries showcase their necessity in achieving realistic uncertainty estimates.
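A minimal NumPy sketch of the underlying idea for $\operatorname{SO}(3)$: draw reparameterization noise on the Lie algebra $\mathfrak{so}(3)\cong\mathbb{R}^3$, push it through the exponential map (Rodrigues' formula), and translate by left multiplication with a location parameter. This only illustrates the construction; the paper works in an autodiff framework so that gradients flow through the deterministic map, and all names below are assumptions rather than the paper's API.

```python
import numpy as np

def hat(v):
    """so(3) hat map: R^3 -> 3x3 skew-symmetric matrices."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def expm_so3(v):
    """Rodrigues' formula: matrix exponential of hat(v), landing in SO(3)."""
    theta = np.linalg.norm(v)
    K = hat(v)
    if theta < 1e-12:
        return np.eye(3) + K             # first-order approximation near 0
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta ** 2 * (K @ K))

def sample_so3(mu_rot, sigma, rng):
    """Reparameterized sample on SO(3), in the spirit of the paper:
    eps ~ N(0, I) on the Lie algebra, scaled by sigma, pushed through
    exp, then translated by left multiplication with a location mu_rot.
    In an autodiff framework, gradients w.r.t. sigma and mu_rot flow
    through this deterministic map."""
    eps = rng.standard_normal(3)         # reparameterization noise
    return mu_rot @ expm_so3(sigma * eps)

rng = np.random.default_rng(0)
R = sample_so3(np.eye(3), 0.1, rng)
print(np.allclose(R @ R.T, np.eye(3)))   # True: R is a rotation matrix
```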